ABSTRACT

Several procedures for ranking populations according to the
quantile of a given order have been discussed in the literature.
These procedures deal with continuous distributions. This paper
deals with the problem of selecting the population with the
largest $\alpha$-quantile from $k \ge 2$ finite populations, where
the size of each population is known. A selection rule based on
the sample quantiles is given, where the samples are drawn without
replacement. A formula for the minimum probability of a correct
selection under the given rule, for a certain configuration of the
population $\alpha$-quantiles, is given in terms of the sample
numbers.
1. Introduction. This paper deals with the problem of selecting
a population from amongst several populations whose distribution
functions are unknown. In many situations certain parameters of
the distributions are of interest, such as the mean and the
variance. The populations are then ranked according to the values
of those parameters, and the selection is based on their estimated
values. It may be required to select, for example, the population
with the smallest variance or the population with the largest mean.
Whereas the mean, the variance, a scale parameter and a location
parameter are parameters of general interest, it is sometimes
justifiable to rank the populations according to the quantile of a
given order, especially when the standard parameters do not exist.
Rizvi, Sobel and Woodworth (1968) have discussed a comparison of
populations in terms of $\alpha$-quantiles. Sobel (1967) and Rizvi
and Sobel (1967) have considered the problem of selecting the
population with the largest $\alpha$-quantile and the problem of
selecting a subset of the $k \ge 2$ populations which includes the
population with the largest $\alpha$-quantile.

The papers mentioned above pertain to large populations. In this
paper we consider the problem of selecting the population with the
largest $\alpha$-quantile from $k \ge 2$ finite populations. The
populations are sampled without replacement. A practical situation
in which the problem may arise is illustrated by the following
example. Suppose that the Department of Education in a certain
state is interested in selecting one of several schools in an area
to implement a special training program for exceptionally bright
students. As the program involves only exceptionally bright
students, the school with the largest 75th percentile score on a
special merit examination (SME) may be selected for the training
program. A random sample of $n$ students is taken from each school
and the selected students are given the SME. The selection of the
school for the special training program would depend on the SME
scores.

The selection problem is formulated as follows. Let
$\pi_1, \ldots, \pi_k$ denote $k \ge 2$ finite populations and let
$N_i$ denote the size of $\pi_i$, $i = 1, \ldots, k$. The numbers
$N_1, \ldots, N_k$ are assumed to be known. The members of each
population are ranked according to some characteristic value. Let
$X_{i,m_i}$ denote the $m_i$-th smallest value of the elements of
$\pi_i$ for $m_i = \alpha N_i$, where $\alpha$ is a given positive
fraction. Then $X_{i,m_i}$ represents the $\alpha$-quantile of
$\pi_i$. It is assumed that $\alpha N_i$ is integer valued for each
$i = 1, \ldots, k$. We shall call the population associated with
the largest value of $X_{i,m_i}$ the best population. It is
required to select the best population.

Suppose that a sample of $n_i < N_i$ elements is drawn from $\pi_i$
without replacement. Let $S_i$ denote the sample $\alpha$-quantile,
that is, the $\alpha n_i$-th smallest value in the sample. It is
assumed for simplicity that $\alpha n_i$ is integer valued for each
$i = 1, \ldots, k$. We select the population associated with the
largest value of $S_i$ as the best population. We shall call this
procedure S.
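
As an illustration, the rule just described can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names `sample_quantile` and `procedure_s` are hypothetical, each population is held as a list of numeric values, $\alpha n_i$ is an integer, and ties between sample quantiles are broken in favour of the lowest index (the text does not specify a tie rule).

```python
import random


def sample_quantile(values, alpha):
    """Return the (alpha * n)-th smallest value of a sample of size n.

    Assumes alpha * n is a positive integer, as in the text.
    """
    n = len(values)
    rank = round(alpha * n)
    if rank < 1 or abs(alpha * n - rank) > 1e-9:
        raise ValueError("alpha * n must be a positive integer")
    return sorted(values)[rank - 1]


def procedure_s(populations, sample_sizes, alpha, rng=None):
    """Sketch of procedure S.

    Draws n_i elements without replacement from each population and
    returns (index of the selected population, list of sample quantiles).
    Ties are broken by the lowest index (an assumption of this sketch).
    """
    rng = rng or random.Random()
    quantiles = []
    for population, n in zip(populations, sample_sizes):
        sample = rng.sample(population, n)  # sampling without replacement
        quantiles.append(sample_quantile(sample, alpha))
    selected = max(range(len(populations)), key=lambda i: quantiles[i])
    return selected, quantiles
```

In the school example, `populations` would hold the SME scores of every student in each school, which are of course unknown in practice; the sketch is useful mainly for simulation.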

Let $\epsilon$ be a positive fraction, such that
$\epsilon < \alpha < 1 - \epsilon$. We assume for simplicity that
$\epsilon N_i$ is integer valued for each $i = 1, \ldots, k$. Let
$C_\epsilon$ denote a configuration of population quantiles given by

(1.1)  $X_{j,(\alpha+\epsilon)N_j} < X_{i,(\alpha-\epsilon)N_i}, \quad j = 1, \ldots, i-1, i+1, \ldots, k$

when $\pi_i$ is the best population, $i = 1, \ldots, k$. This is
called a preference zone. It is required that the probability of a
correct selection (PCS) for the procedure S should satisfy the
relation PCS $\ge P^*$ in the preference zone, where $P^*$ is a
pre-assigned number, such that $1/k < P^* < 1$.

The value of the PCS for the procedure S depends on the sample
numbers $n_1, \ldots, n_k$. In the following section we derive a
formula for the minimum value of the PCS inside $C_\epsilon$. The
sample numbers needed to satisfy the probability requirement for S
are determined from the given formula.

2. Procedure S.

Suppose that $\pi_i$ is the best population. It is easy to see
that the probability of a correct selection for the procedure S,
given the configuration $C_\epsilon$, is minimized when

(2.1)  $X_{i,(\alpha-\epsilon)N_i - 1} = X_{j,1} = X_{j,(\alpha+\epsilon)N_j} < X_{i,(\alpha-\epsilon)N_i} = X_{i,N_i} = X_{j,(\alpha+\epsilon)N_j + 1}$

for $j = 1, \ldots, i-1, i+1, \ldots, k$. Therefore, given
$C_\epsilon$, the minimum probability of a correct selection is
equal to


(2.2)  $\min_{i=1,\ldots,k} P\{S_j < S_i,\ j = 1, \ldots, i-1, i+1, \ldots, k \mid \pi_i \text{ is the best population}\} = \min(R_1, \ldots, R_k)$


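where

(2.3)  $R_i = \sum_{r=0}^{\alpha n_i - 1} \binom{(\alpha-\epsilon)N_i - 1}{r} \binom{(1-\alpha+\epsilon)N_i + 1}{n_i - r} \Big/ \binom{N_i}{n_i}$

       $\times \prod_{j=1,\, j \ne i}^{k} \ \sum_{r=\alpha n_j}^{n_j} \binom{(\alpha+\epsilon)N_j}{r} \binom{(1-\alpha-\epsilon)N_j}{n_j - r} \Big/ \binom{N_j}{n_j}$

       $= Q(\alpha n_i - 1;\ n_i, (\alpha-\epsilon)N_i - 1, N_i) \times \prod_{j=1,\, j \ne i}^{k} \bigl(1 - Q(\alpha n_j - 1;\ n_j, (\alpha+\epsilon)N_j, N_j)\bigr)$

and

(2.4)  $Q(x; n, M, N) = \sum_{r=0}^{x} \binom{M}{r} \binom{N-M}{n-r} \Big/ \binom{N}{n}$

denotes the cumulative distribution function of the hypergeometric
distribution.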
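
The quantities in (2.2) to (2.4) can be evaluated directly with exact integer binomial coefficients. The following Python sketch is one way to do so; the function names (`hypergeom_cdf`, `r_i`, `min_pcs`) are hypothetical, and the code assumes, as the text does, that $\alpha n_i$, $\alpha N_i$ and $\epsilon N_i$ are integers (products are rounded to the nearest integer to absorb floating-point error).

```python
from math import comb


def hypergeom_cdf(x, n, M, N):
    """Q(x; n, M, N): probability of at most x marked items in a sample
    of n drawn without replacement from N items, M of which are marked."""
    if x < 0:
        return 0.0
    upper = min(x, n, M)
    # math.comb(a, b) returns 0 when b > a, so impossible terms vanish.
    total = sum(comb(M, r) * comb(N - M, n - r) for r in range(upper + 1))
    return total / comb(N, n)


def r_i(i, alpha, eps, sizes, samples):
    """R_i of (2.3): minimum PCS when population i is the best one."""
    N_i, n_i = sizes[i], samples[i]
    value = hypergeom_cdf(round(alpha * n_i) - 1, n_i,
                          round((alpha - eps) * N_i) - 1, N_i)
    for j, (N_j, n_j) in enumerate(zip(sizes, samples)):
        if j != i:
            value *= 1.0 - hypergeom_cdf(round(alpha * n_j) - 1, n_j,
                                         round((alpha + eps) * N_j), N_j)
    return value


def min_pcs(alpha, eps, sizes, samples):
    """Minimum PCS of (2.2): the smallest R_i over i = 1, ..., k."""
    return min(r_i(i, alpha, eps, sizes, samples)
               for i in range(len(sizes)))
```

For example, `min_pcs(0.5, 0.05, [40, 40, 40], [20, 20, 20])` evaluates the bound for three populations of size 40 with samples of size 20 each.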

The value of the PCS given by (2.2) and (2.3) can be computed from
the tables of the hypergeometric distribution prepared by Lieberman
and Owen (1961). If $N_i$ is large and $n_i$ is small compared to
$N_i$ for each $i$, then $R_i$ is approximately given by


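(2.5)  $R_i \approx I_{1-\alpha+\epsilon}\bigl((1-\alpha)n_i + 1,\ \alpha n_i\bigr) \prod_{j=1,\, j \ne i}^{k} I_{\alpha+\epsilon}\bigl(\alpha n_j,\ (1-\alpha)n_j + 1\bigr)$

where

$I_p(a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \int_0^p x^{a-1} (1-x)^{b-1} \, dx$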


denotes the incomplete beta function. On the other hand, if $n_i$
and $N_i$ are large for each $i$, such that $n_i/N_i \to \xi_i$,
say, where $\xi_i < 2(1 - \max(\epsilon/\alpha,\ \epsilon/(1-\alpha)))$,
then $R_i$ is approximately given by (see Wise (1954))


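(2.6)  $R_i \approx I_{1-\alpha+2\epsilon/(2-\xi_i)}\bigl((1-\alpha)n_i + 1,\ \alpha n_i\bigr) \prod_{j=1,\, j \ne i}^{k} I_{w_j}\bigl(\alpha n_j,\ (1-\alpha)n_j + 1\bigr)$

where

$w_j = \alpha + \frac{2\epsilon}{2 - \xi_j}.$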


Note that (2.6) reduces to (2.5) for $\xi_1 = \cdots = \xi_k = 0$.

If the $k$ populations are of equal size $N$, say, we take
$n_1 = n_2 = \cdots = n_k = n$, say. Then from (2.2) and (2.3) the
minimum value of the PCS is equal to


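(2.7)  $\Bigl\{ \sum_{r=0}^{\alpha n - 1} \binom{(\alpha-\epsilon)N - 1}{r} \binom{(1-\alpha+\epsilon)N + 1}{n - r} \Bigr\} \times \Bigl\{ \sum_{r=\alpha n}^{n} \binom{(\alpha+\epsilon)N}{r} \binom{(1-\alpha-\epsilon)N}{n - r} \Bigr\}^{k-1} \Big/ \binom{N}{n}^{k}.$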

Table I below gives the minimum value of $n$ for which PCS $\ge P^*$
for $\alpha = .50$, $\epsilon = .05, .10$, $P^* = .75, .95, .99$,
$k = 1(1)5$ and $N = 30, 40, 50, 100, 200, 400$. The minimum value
of the PCS for the given $n$ is also shown in the table. It is seen
from the table that in some cases there is a considerable difference
between the minimum value of the PCS and the prescribed value $P^*$,
due to the discrete nature of the hypergeometric distribution.
However, the discrepancy is reduced for large $N$.
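
A search of the kind that produces such a table can be sketched as follows. This is a hedged illustration, not the computation actually used for Table I: the function names `min_pcs_equal` and `smallest_n` are hypothetical, only sample sizes with $\alpha n$ a positive integer are tried, and $(\alpha \pm \epsilon)N$ are assumed to be integers.

```python
from math import comb


def min_pcs_equal(n, N, k, alpha, eps):
    """Minimum PCS of (2.7) for k populations of common size N and
    common sample size n, using exact integer arithmetic."""
    a_n = round(alpha * n)
    low_best = round((alpha - eps) * N) - 1   # (alpha - eps) N - 1
    low_other = round((alpha + eps) * N)      # (alpha + eps) N
    first = sum(comb(low_best, r) * comb(N - low_best, n - r)
                for r in range(0, a_n))
    second = sum(comb(low_other, r) * comb(N - low_other, n - r)
                 for r in range(a_n, n + 1))
    return first * second ** (k - 1) / comb(N, n) ** k


def smallest_n(N, k, alpha, eps, p_star):
    """Smallest n (with alpha * n a positive integer) whose minimum PCS
    reaches p_star, or None if no n <= N qualifies."""
    for n in range(1, N + 1):
        a_n = alpha * n
        if round(a_n) < 1 or abs(a_n - round(a_n)) > 1e-9:
            continue
        if min_pcs_equal(n, N, k, alpha, eps) >= p_star:
            return n
    return None
```

A call such as `smallest_n(40, 3, 0.5, 0.05, 0.75)` returns a candidate common sample size, and `min_pcs_equal` evaluated at that size shows how far the attained minimum PCS lies above $P^*$.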
In practice, the given populations would vary in size. Therefore,
we consider the value of $R_i$ given by (2.6). It is easily shown
that for large $m$ the beta integral $I_p(\alpha m, (1-\alpha)m + 1)$
is increasing (decreasing) in $m$ for $p > (<)\ \alpha$. Therefore,
$R_i$ is an increasing function of $n_1, \ldots, n_k$ when the
sample numbers are large. Thus a smallest sample number can be
prescribed for each population to meet the probability requirement
for the procedure S in the general case when the population size
varies. In this case it would be interesting to find an optimal
distribution of the sample numbers, given $\sum_{i=1}^{k} n_i = n$,
say. This is an exercise in linear programming, where we maximize a
function $f(\xi_1, \ldots, \xi_k)$ subject to the constraints
$\sum_{i=1}^{k} \xi_i N_i = n$ and $0 < \xi_i < 1$,
$i = 1, \ldots, k$. Here
$f(\xi_1, \ldots, \xi_k) = \min(R_1, \ldots, R_k)$, where $R_i$ is
given by (2.6) with the substitution $\xi_\ell N_\ell$ for
$n_\ell$, $\ell = 1, \ldots, k$.
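
For a small number of populations the allocation problem can be explored by brute force. The sketch below is an illustration only and swaps in a different technique from the one described above: it enumerates integer allocations with a fixed total and scores them with the exact hypergeometric expression (2.3) rather than the approximation (2.6). The function names and the `step` parameter (the spacing of admissible sample sizes, chosen so that $\alpha n_i$ is an integer) are assumptions of the sketch.

```python
from itertools import product
from math import comb


def hypergeom_cdf(x, n, M, N):
    """Q(x; n, M, N) of (2.4)."""
    if x < 0:
        return 0.0
    upper = min(x, n, M)
    return sum(comb(M, r) * comb(N - M, n - r)
               for r in range(upper + 1)) / comb(N, n)


def min_pcs(alpha, eps, sizes, samples):
    """min(R_1, ..., R_k) with R_i evaluated exactly from (2.3)."""
    values = []
    for i in range(len(sizes)):
        value = hypergeom_cdf(round(alpha * samples[i]) - 1, samples[i],
                              round((alpha - eps) * sizes[i]) - 1, sizes[i])
        for j in range(len(sizes)):
            if j != i:
                value *= 1.0 - hypergeom_cdf(
                    round(alpha * samples[j]) - 1, samples[j],
                    round((alpha + eps) * sizes[j]), sizes[j])
        values.append(value)
    return min(values)


def best_allocation(alpha, eps, sizes, total, step):
    """Exhaustive search over allocations (n_1, ..., n_k) with
    n_1 + ... + n_k == total and each n_i a multiple of step.
    Returns (allocation, minimum PCS), or None if none is feasible."""
    best = None
    choices = [range(step, N + 1, step) for N in sizes]
    for samples in product(*choices):
        if sum(samples) != total:
            continue
        value = min_pcs(alpha, eps, sizes, samples)
        if best is None or value > best[1]:
            best = (samples, value)
    return best
```

The enumeration grows quickly with $k$, so this is practical only for a handful of populations; it is meant to make the objective and the constraint concrete rather than to be efficient.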

Remark 1. We observe that the value of the PCS given by (2.2),
where $R_i$ is given by (2.5), represents the minimum probability
of a correct selection when the samples are drawn with replacement.
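
The with-replacement case of Remark 1 can be checked numerically by replacing each hypergeometric term with the corresponding binomial term. The sketch below assumes that reading: the lowest $(\alpha-\epsilon)$ fraction of the best population and the lowest $(\alpha+\epsilon)$ fraction of every other population play the role of "marked" items. The function names are hypothetical, and binomial sums are used in place of the incomplete beta function, to which they are related by the standard identity for binomial tails.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(Y <= x) for Y binomial with n trials and success probability p."""
    if x < 0:
        return 0.0
    return sum(comb(n, r) * p ** r * (1.0 - p) ** (n - r)
               for r in range(0, min(x, n) + 1))


def r_i_with_replacement(i, alpha, eps, samples):
    """Binomial analogue of R_i for sampling with replacement.

    Assumes alpha * n_j is an integer for every sample size n_j.
    """
    n_i = samples[i]
    value = binom_cdf(round(alpha * n_i) - 1, n_i, alpha - eps)
    for j, n_j in enumerate(samples):
        if j != i:
            value *= 1.0 - binom_cdf(round(alpha * n_j) - 1, n_j,
                                     alpha + eps)
    return value
```

Because the value depends only on the sample sizes and not on the population sizes, comparing it with the exact hypergeometric value for a given $N_i$ shows how much is gained by sampling without replacement from a small population.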

Remark 2. It is seen from (2.1) that the value of the PCS is equal
to 1 if

$n_i / N_i \ge \max\bigl(1 - \epsilon/\alpha,\ 1 - \epsilon/(1-\alpha)\bigr), \quad i = 1, \ldots, k.$

Therefore, a correct selection is obtained with probability 1 with
$n_i < N_i$ for each $i$.
Table I - Minimum Value of n for PCS